Purely Magnetic Spacetimes
Purely magnetic spacetimes, in which the Riemann tensor satisfies $R_{abcd}u^{b}u^{d} = 0$ for some unit timelike vector $u^{a}$, are studied. The algebraic consequences for the Weyl and Ricci tensors are examined in detail, and consideration is given to the uniqueness of $u^{a}$. Some remarks are made concerning the nature of the congruence associated with $u^{a}$.
Comment: 12 pages, standard LaTeX. Submitted to Classical and Quantum Gravity
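For context, a brief sketch of the standard decomposition the abstract alludes to: relative to a unit timelike vector $u^a$, the Weyl tensor splits into electric and magnetic parts, and "purely magnetic" means the electric part vanishes. The conventions below are the usual textbook ones, not quotations from the paper, and index placements vary across references.

```latex
% Electric and magnetic parts of the Weyl tensor relative to u^a:
E_{ab} = C_{acbd}\, u^{c} u^{d}, \qquad
H_{ab} = {}^{*}C_{acbd}\, u^{c} u^{d}, \qquad
{}^{*}C_{abcd} = \tfrac{1}{2}\,\varepsilon_{ab}{}^{ef} C_{efcd}.
% "Purely magnetic" (Weyl):    E_{ab} = 0 for some unit timelike u^a.
% "Purely magnetic" (Riemann): R_{acbd}\, u^{c} u^{d} = 0.
```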
Revisiting End-to-End Speech-to-Text Translation From Scratch
End-to-end (E2E) speech-to-text translation (ST) often depends on pretraining its encoder and/or decoder with source transcripts, via speech recognition or text translation tasks; without such pretraining, translation performance drops substantially. However, transcripts are not always available, and how significant such pretraining is for E2E ST has rarely been studied in the literature. In this paper, we revisit this question and explore the extent to which the quality of E2E ST trained on speech-translation pairs alone can be improved. We reexamine several techniques previously shown to benefit ST, and offer a set of best practices that biases a Transformer-based E2E ST system toward training from scratch. In addition, we propose a parameterized distance penalty to facilitate the modeling of locality in self-attention over speech. On four benchmarks covering 23 languages, our experiments show that, without using any transcripts or pretraining, the proposed system matches and even outperforms previous studies that adopt pretraining, although the gap remains in (extremely) low-resource settings. Finally, we discuss neural acoustic feature modeling, where a neural model is designed to extract acoustic features directly from raw speech signals, with the goal of simplifying inductive biases and giving the model more freedom in describing speech. For the first time, we demonstrate its feasibility and show encouraging results on ST tasks.
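The distance penalty lends itself to a compact illustration. Below is a minimal sketch of single-head self-attention whose logits are biased by a learnable penalty on query-key distance; the exact penalty form used here (a scalar `rho` times absolute position distance) is an assumption for illustration, not the paper's precise parameterization.

```python
import numpy as np

def attention_with_distance_penalty(Q, K, V, rho):
    """Single-head self-attention with a learnable distance penalty.

    Q, K, V: (T, d) arrays for one head; rho: learnable scalar (> 0) that
    scales the penalty. The form rho * |i - j| is an illustrative
    assumption, not necessarily the paper's exact parameterization.
    """
    T, d = Q.shape
    logits = Q @ K.T / np.sqrt(d)                  # scaled dot-product scores
    pos = np.arange(T)
    dist = np.abs(pos[:, None] - pos[None, :])     # |i - j| distance matrix
    logits = logits - rho * dist                   # bias toward local context
    logits -= logits.max(axis=-1, keepdims=True)   # stabilize softmax
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                             # (T, d) contextualized values
```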
The WMT Shared Tasks
The annual WMT Conference on Machine Translation has been running shared tasks since 2006. It started with a translation task based on Europarl, and has grown to include tasks on all aspects of MT corpus preparation, training and evaluation, including the flagship task on news translation. I will review the history of the task, lessons learnt, and plans for future tasks.
Applying Pairwise Ranked Optimisation to Improve the Interpolation of Translation Models
In Statistical Machine Translation we often have to combine different sources of parallel training data to build a good system. One way of doing this is to build separate translation models from each data set and linearly interpolate them; to date, the main method for optimising the interpolation weights has been to minimise the model perplexity on a held-out set. In this work, rather than optimising this indirect measure, we directly optimise for BLEU on the tuning set, and show improvements in average performance over two data sets and eight language pairs.
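As a concrete illustration of the linear interpolation being tuned, here is a minimal sketch. The `PhraseTable` type and single-feature setup are simplifying assumptions (real phrase tables carry several features), and the weights here would be the ones optimised for BLEU.

```python
from typing import Dict, List, Tuple

# (source phrase, target phrase) -> p(target | source)
PhraseTable = Dict[Tuple[str, str], float]

def interpolate(tables: List[PhraseTable], weights: List[float]) -> PhraseTable:
    """Linearly interpolate phrase tables: p(e|f) = sum_i w_i * p_i(e|f)."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    combined: PhraseTable = {}
    for table, w in zip(tables, weights):
        for pair, prob in table.items():
            combined[pair] = combined.get(pair, 0.0) + w * prob
    return combined
```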
Bridging linguistic typology and multilingual machine translation with multi-view language representations
Sparse language vectors from linguistic typology databases and learned
embeddings from tasks like multilingual machine translation have been
investigated in isolation, without analysing how they could benefit from each
other's language characterisation. We propose to fuse both views using singular
vector canonical correlation analysis and study what kind of information is
induced from each source. By inferring typological features and language
phylogenies, we observe that our representations embed typology and strengthen
correlations with language relationships. We then take advantage of our
multi-view language vector space for multilingual machine translation, where we
achieve competitive overall translation accuracy in tasks that require
information about language similarities, such as language clustering and
ranking candidates for multilingual transfer. With our method, we can easily
project and assess new languages without expensive retraining of massive
multilingual or ranking models, which are major disadvantages of related
approaches.
Comment: 15 pages, 6 figures
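A hedged sketch of the fusion step: reduce each view with SVD, align the reduced views with CCA, and take the canonical components as the shared space. The function names, the variance threshold `keep`, and concatenating both views' canonical projections are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def svcca_fuse(X, Y, keep=0.99, dims=32):
    """Fuse two views of language vectors with SVD + CCA (illustrative).

    X: (n_langs, d1) typology-based vectors; Y: (n_langs, d2) learned
    embeddings from, e.g., multilingual machine translation.
    """
    def svd_reduce(M):
        M = M - M.mean(axis=0)                      # center each feature
        U, s, _ = np.linalg.svd(M, full_matrices=False)
        ratios = np.cumsum(s**2) / np.sum(s**2)     # explained-variance curve
        k = int(np.searchsorted(ratios, keep)) + 1  # smallest k covering `keep`
        return U[:, :k] * s[:k]                     # SVD-reduced view

    Xr, Yr = svd_reduce(X), svd_reduce(Y)
    n = min(dims, Xr.shape[1], Yr.shape[1])
    Xc, Yc = CCA(n_components=n).fit_transform(Xr, Yr)
    return np.concatenate([Xc, Yc], axis=1)         # shared multi-view space
```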
Language Model Prior for Low-Resource Neural Machine Translation
The scarcity of large parallel corpora is an important obstacle for neural
machine translation. A common solution is to exploit the knowledge of language
models (LM) trained on abundant monolingual data. In this work, we propose a
novel approach to incorporate a LM as prior in a neural translation model (TM).
Specifically, we add a regularization term, which pushes the output
distributions of the TM to be probable under the LM prior, while avoiding wrong
predictions when the TM "disagrees" with the LM. This objective relates to
knowledge distillation, where the LM can be viewed as teaching the TM about the
target language. The proposed approach does not compromise decoding speed,
because the LM is used only at training time, unlike previous work that
requires it during inference. We present an analysis of the effects that
different methods have on the distributions of the TM. Results on two
low-resource machine translation datasets show clear improvements even with
limited monolingual data.
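A minimal sketch of how such a regularizer can be written, assuming a KL term between the TM's output distribution and the frozen LM's, with hypothetical hyperparameters `lam` (weight of the prior term) and `tau` (distillation-style temperature). This illustrates the general idea, not the paper's exact objective; note the LM is needed only while this loss is computed, i.e. at training time.

```python
import torch
import torch.nn.functional as F

def lm_prior_loss(tm_logits, lm_logits, targets, lam=0.5, tau=2.0):
    """Translation loss with an LM-prior regularizer (illustrative sketch).

    tm_logits: (batch, vocab) logits from the translation model (TM).
    lm_logits: (batch, vocab) logits from a frozen language model (LM).
    targets:   (batch,) gold token ids.
    lam, tau:  hypothetical hyperparameters (prior weight, temperature).
    """
    ce = F.cross_entropy(tm_logits, targets)              # standard NMT loss
    # KL(TM || LM): small when the TM puts probability mass where the
    # LM prior does, computed at temperature tau as in distillation.
    kl = F.kl_div(
        F.log_softmax(lm_logits.detach() / tau, dim=-1),  # log prior (frozen)
        F.softmax(tm_logits / tau, dim=-1),               # TM distribution
        reduction="batchmean",
    )
    return ce + lam * tau**2 * kl
```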